    The Importance of Modularity in Bioinformatics Tools

    In the last decade the number of bioinformatics tools has increased enormously. There are tools to store, analyse, visualize, edit or generate biological data, and more are in development. Still, the demand for increased functionality in a single piece of software must be balanced by the need for modularity to keep the software maintainable. In complex systems, the conflicting demands of features and maintainability are often resolved by plug-in systems.

For example, Cytoscape, an open source platform for complex-network analysis and visualization, uses a plug-in system to allow the extension of the application without changing the core. This not only allows the integration of new functionality without a new release but also lets other developers contribute plug-ins that are needed in their research.

Most tools have their own, individual plug-in system to meet the needs of the application. These are often very simple and easy to use. However, the increasing complexity of plug-ins demands more functionality from the plug-in system. We want to reuse components in different contexts, to have simple plug-in interfaces, and to allow communication and dependencies between plug-ins. Many tools implemented in Java face these problems and there seems to be a common solution: the integration of an established modularity framework such as OSGi. To our knowledge, a number of developers of bioinformatics tools are already implementing, planning or considering the integration of OSGi into their applications, e.g. Cytoscape, Protege, PathVisio, ImageJ, Jalview or Chipster. The adoption of modularity frameworks in the development of bioinformatics applications is steadily increasing and should be considered in the design of new software.

By modularity in the traditional computer science sense, we mean the division of a software application into logical parts with separate concerns. To ease development, the application is separated into smaller logical parts, which are implemented individually. A set of modules can form a larger application, but only if a proper glue is used; OSGi is an example of such a glue. OSGi makes it possible to build an infrastructure into an application for adding and using different modules. It provides mechanisms that allow the individual modules to rely on and interact with each other, opening the possibility to put together different modules to solve the problem at hand. Later, modules can be removed and new ones can be added to tackle another problem. As Katy Boerner writes in her article 'Plug-and-Play Macroscopes', we should 'implement software frameworks that empower domain scientists to assemble their own continuously evolving macroscopes, adding and upgrading existing (and removing obsolete) plug-ins to arrive at a set that is truly relevant for their work'.
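
The idea of modules that depend on each other and can be added or removed at runtime can be illustrated with a minimal sketch in plain Java. This is not OSGi and does not use any OSGi API; `Plugin`, `SimplePlugin`, `PluginRegistry` and `PluginDemo` are hypothetical names introduced only for illustration, and the dependency check is a deliberately simplified stand-in for what a real modularity framework provides.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Hypothetical plug-in contract: a name, the plug-ins it requires, one action. */
interface Plugin {
    String name();
    List<String> requires();
    String run(String input);
}

/** Hypothetical plug-in built from a name, a dependency list and a function. */
class SimplePlugin implements Plugin {
    private final String name;
    private final List<String> requires;
    private final UnaryOperator<String> action;

    SimplePlugin(String name, List<String> requires, UnaryOperator<String> action) {
        this.name = name;
        this.requires = requires;
        this.action = action;
    }

    public String name() { return name; }
    public List<String> requires() { return requires; }
    public String run(String input) { return action.apply(input); }
}

/** Hypothetical registry playing the role of the "glue" between modules. */
class PluginRegistry {
    private final Map<String, Plugin> plugins = new LinkedHashMap<>();

    /** Adds a plug-in, refusing it when a required plug-in is not installed. */
    void install(Plugin plugin) {
        for (String dependency : plugin.requires()) {
            if (!plugins.containsKey(dependency)) {
                throw new IllegalStateException(
                    plugin.name() + " requires missing plug-in " + dependency);
            }
        }
        plugins.put(plugin.name(), plugin);
    }

    /** Removes a plug-in, refusing when another installed plug-in requires it. */
    void uninstall(String name) {
        for (Plugin other : plugins.values()) {
            if (other.requires().contains(name)) {
                throw new IllegalStateException(other.name() + " still requires " + name);
            }
        }
        plugins.remove(name);
    }

    Plugin get(String name) { return plugins.get(name); }

    List<String> installed() { return new ArrayList<>(plugins.keySet()); }
}

class PluginDemo {
    public static void main(String[] args) {
        PluginRegistry registry = new PluginRegistry();
        registry.install(new SimplePlugin("parser", List.of(), String::trim));
        registry.install(new SimplePlugin("viewer", List.of("parser"), s -> "[" + s + "]"));

        String parsed = registry.get("parser").run("  pathway  ");
        System.out.println(registry.get("viewer").run(parsed)); // prints "[pathway]"

        // Modules are removed in reverse dependency order.
        registry.uninstall("viewer");
        registry.uninstall("parser");
    }
}
```

In this sketch the core application only knows the `Plugin` interface, so new functionality can be installed without changing the registry itself.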

Some of these modules are going to be specific to one application, but many of them can actually be reused by other tools. We are talking about general features like the import or export of different file formats, a layout algorithm that could be used by several visualization tools, or the lookup in an external online database. Why should every tool implement its own parser or algorithm? Modularity can help to share functionality. There is no need to start from scratch and implement everything anew, so developers can focus on new and important features.

Adding modularity, or better, a modularity framework to an existing software application is not a trivial task. The developers of Cytoscape are currently undertaking this challenge with the coming version 3. We are also working on the integration of OSGi into our pathway visualization tool PathVisio, and we now want to share and compare our experiences so that others can benefit from our discoveries. This will help them not only to decide whether OSGi is a suitable solution for them but also in the integration process itself.

    WikiPathways: building research communities on biological pathways.

    Here, we describe the development of WikiPathways (http://www.wikipathways.org), a public wiki for pathway curation, since it was first published in 2008. New features are discussed, as well as developments in the community of contributors. New features include a zoomable pathway viewer, support for pathway ontology annotations, the ability to mark pathways as private for a limited time and the availability of stable hyperlinks to pathways and the elements therein. WikiPathways content is freely available in a variety of formats such as the BioPAX standard, and the content is increasingly adopted by external databases and tools, including Wikipedia. A recent development is the use of WikiPathways as a staging ground for centrally curated databases such as Reactome. WikiPathways is seeing steady growth in the number of users, page views and edits for each pathway. To assess whether the community curation experiment can be considered successful, here we analyze the relation between use and contribution, which gives results in line with other wiki projects. The novel use of pathway pages as supplementary material to publications, as well as the addition of tailored content for research domains, is expected to stimulate growth further

    BridgeDb: standardized access to gene, protein and metabolite identifier mapping services

    Many interesting problems in bioinformatics require integration of data from various sources, for example combining microarray data with a pathway database, or merging co-citation networks with protein-protein interaction networks. Invariably this leads to an identifier mapping problem, where different datasets are annotated with identifiers that are related, but originate from different databases.

Solutions for the identifier mapping problem exist, such as Biomart, Synergizer, Cronos, PICR, HMS and many more. This creates an opportunity for bioinformatics tool developers. Tools can be made to flexibly support multiple mapping services or mapping services could be combined to get broader coverage. This approach requires an interface layer between tools and mapping services. BridgeDb provides such an interface layer, in the form of both a Java and REST API.

Because of the standardized interface layer, BridgeDb is not tied to a specific source of mapping information. You can switch easily between flat files, relational databases and several different web services. Mapping services can be combined to support multi-omics experiments or to integrate custom microarray annotations. BridgeDb isn't just yet another mapping service: it tries to build further on existing work, and integrate multiple partial solutions. The framework is intended for customization and adaptation to any identifier mapping service. 
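
A minimal sketch of such an interface layer, assuming a deliberately simplified contract, might look as follows. It does not reproduce the actual BridgeDb Java API: `MappingSource`, `InMemorySource` and `CombinedSource` are hypothetical names, the in-memory table stands in for a flat file, database or web service, and the identifiers and database name in the demo are made up.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical common interface that every mapping service is wrapped in. */
interface MappingSource {
    /** Returns all identifiers in targetDatabase that correspond to sourceId. */
    Set<String> map(String sourceId, String targetDatabase);
}

/** Hypothetical stand-in for one concrete source (flat file, database, web service). */
class InMemorySource implements MappingSource {
    // sourceId -> targetDatabase -> mapped identifiers
    private final Map<String, Map<String, Set<String>>> table = new HashMap<>();

    InMemorySource add(String sourceId, String targetDatabase, String targetId) {
        table.computeIfAbsent(sourceId, k -> new HashMap<>())
             .computeIfAbsent(targetDatabase, k -> new LinkedHashSet<>())
             .add(targetId);
        return this;
    }

    public Set<String> map(String sourceId, String targetDatabase) {
        return table.getOrDefault(sourceId, Collections.emptyMap())
                    .getOrDefault(targetDatabase, Collections.emptySet());
    }
}

/** Combines several sources; the result is the union of their answers. */
class CombinedSource implements MappingSource {
    private final List<MappingSource> sources;

    CombinedSource(MappingSource... sources) {
        this.sources = Arrays.asList(sources);
    }

    public Set<String> map(String sourceId, String targetDatabase) {
        Set<String> result = new LinkedHashSet<>();
        for (MappingSource source : sources) {
            result.addAll(source.map(sourceId, targetDatabase));
        }
        return result;
    }
}

class MappingDemo {
    public static void main(String[] args) {
        // Made-up identifiers, for illustration only.
        MappingSource flatFile = new InMemorySource().add("geneA", "ProteinDb", "protA1");
        MappingSource webService = new InMemorySource().add("geneA", "ProteinDb", "protA2");

        // The tool only sees MappingSource, so sources can be swapped or stacked.
        MappingSource combined = new CombinedSource(flatFile, webService);
        System.out.println(combined.map("geneA", "ProteinDb")); // prints "[protA1, protA2]"
    }
}
```

Because the calling tool depends only on the interface, switching from one source to another, or combining several for broader coverage, needs no change in the tool's own code.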

BridgeDb makes it easy to add an important capability to existing tools. BridgeDb has already been integrated into several popular bioinformatics applications, such as Cytoscape, WikiPathways, PathVisio, Vanted and Taverna. To encourage tool developers to start using BridgeDb, we've created code examples, online documentation, and a mailing list to ask questions.

We believe that, to meet the challenges that are encountered in bioinformatics today, the software development process should follow a few essential principles: user friendliness, code reuse, modularity and open source. BridgeDb adheres to these principles, and can serve as a useful model for others to follow. BridgeDb can increase the user-friendliness of graphical applications. It re-uses work from other projects such as BioMart and MIRIAM. BridgeDb consists of several small modules, integrated through a common interface (API). Components of BridgeDb can be left out or replaced, for maximum flexibility. BridgeDb was open source from the very beginning of the project. The philosophy of open source is closely aligned with the academic value of building on top of the work of giants.

The BridgeDb library is available at http://www.bridgedb.org.
A paper about BridgeDb was published in BMC Bioinformatics, 2010 Jan 4;11(1):5.

BridgeDb blog: http://www.helixsoft.nl/blog/?tag=bridgedb

    WikiPathways: Pathway Editing for the People

    WikiPathways provides a collaborative platform for creating, updating, and sharing pathway diagrams and serves as an example of content curation by the biology community

    Mining Biological Pathways Using WikiPathways Web Services

    WikiPathways is a platform for creating, updating, and sharing biological pathways [1]. Pathways can be edited and downloaded using the wiki-style website. Here we present a SOAP web service that provides programmatic access to WikiPathways that is complementary to the website. We describe the functionality that this web service offers and discuss several use cases in detail. Exposing WikiPathways through a web service opens up new ways of utilizing pathway information and assisting the community curation process

    Bioinformatics for the NuGO proof of principle study: analysis of gene expression in muscle of ApoE3*Leiden mice on a high-fat diet using PathVisio

    Insulin resistance is a characteristic of type-2 diabetes and its development is associated with an increased fat consumption. Muscle is one of the tissues that become insulin resistant after high fat (HF) feeding. The aim of the present study is to identify processes involved in the development of HF-induced insulin resistance in muscle of ApoE3*Leiden mice by using microarrays. These mice are known to become insulin resistant on a HF diet. Differential gene expression was measured in muscle using the Affymetrix mouse plus 2.0 array. To get more insight into the processes affected, pathway analysis was performed with a new tool, PathVisio. PathVisio is a pathway editor customized with plug-ins (1) to visualize microarray data on pathways and (2) to perform statistical analysis to select pathways of interest. The present study demonstrated that with pathway analysis, using PathVisio, a large variety of processes can be investigated. The significantly regulated genes in muscle of ApoE3*Leiden mice after 12 weeks of HF feeding were involved in several biological pathways including fatty acid beta oxidation, fatty acid biosynthesis, insulin signaling, oxidative stress and inflammation.

    The BridgeDb framework: standardized access to gene, protein and metabolite identifier mapping services

    BACKGROUND: Many complementary solutions are available for the identifier mapping problem. This creates an opportunity for bioinformatics tool developers. Tools can be made to flexibly support multiple mapping services or mapping services could be combined to get broader coverage. This approach requires an interface layer between tools and mapping services. RESULTS: Here we present BridgeDb, a software framework for gene, protein and metabolite identifier mapping. This framework provides a standardized interface layer through which bioinformatics tools can be connected to different identifier mapping services. This approach makes it easier for tool developers to support identifier mapping. Mapping services can be combined or merged to support multi-omics experiments or to integrate custom microarray annotations. BridgeDb provides its own ready-to-go mapping services, both in webservice and local database forms. However, the framework is intended for customization and adaptation to any identifier mapping service. BridgeDb has already been integrated into several bioinformatics applications. CONCLUSION: By uncoupling bioinformatics tools from mapping services, BridgeDb improves capability and flexibility of those tools. All described software is open source and available at http://www.bridgedb.org

    SBML qualitative models: a model representation format and infrastructure to foster interactions between qualitative modelling formalisms and tools

    Background: Qualitative frameworks, especially those based on the logical discrete formalism, are increasingly used to model regulatory and signalling networks. A major advantage of these frameworks is that they do not require precise quantitative data, and that they are well-suited for studies of large networks. While numerous groups have developed specific computational tools that provide original methods to analyse qualitative models, a standard format to exchange qualitative models has been missing. Results: We present the Systems Biology Markup Language (SBML) Qualitative Models Package (“qual”), an extension of the SBML Level 3 standard designed for computer representation of qualitative models of biological networks. We demonstrate the interoperability of models via SBML qual through the analysis of a specific signalling network by three independent software tools. Furthermore, the collective effort to define the SBML qual format paved the way for the development of LogicalModel, an open-source model library, which will facilitate the adoption of the format as well as the collaborative development of algorithms to analyse qualitative models. Conclusions: SBML qual allows the exchange of qualitative models among a number of complementary software tools. SBML qual has the potential to promote collaborative work on the development of novel computational approaches, as well as on the specification and the analysis of comprehensive qualitative models of regulatory and signalling networks

    The systems biology format converter

    BACKGROUND: Interoperability between formats is a recurring problem in systems biology research. Many tools have been developed to convert computational models from one format to another. However, they have been developed independently, resulting in redundancy of efforts and lack of synergy. RESULTS: Here we present the Systems Biology Format Converter (SBFC), which provides a generic framework to potentially convert any format into another. The framework currently includes several converters translating between the following formats: SBML, BioPAX, SBGN-ML, Matlab, Octave, XPP, GPML, Dot, MDL and APM. This software is written in Java and can be used as a standalone executable or web service. CONCLUSIONS: The SBFC framework is an evolving software project. Existing converters can be used and improved, and new converters can be easily added, making SBFC useful to both modellers and developers. The source code and documentation of the framework are freely available from the project web site. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-1000-2) contains supplementary material, which is available to authorized users.
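
The general shape of a framework in which converters are registered per format pair and new ones can be added later can be sketched as below. This is an illustration under stated assumptions, not the SBFC API: `ConverterFramework` and its methods are hypothetical, and the registered converter in the demo is a toy function that does not perform a real SBML to Dot translation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Hypothetical registry of converters, keyed by source and target format. */
class ConverterFramework {
    private final Map<String, UnaryOperator<String>> converters = new HashMap<>();

    // Format names are assumed not to contain "->".
    private static String key(String from, String to) {
        return from + "->" + to;
    }

    /** Adds (or replaces) the converter for one format pair. */
    void register(String from, String to, UnaryOperator<String> converter) {
        converters.put(key(from, to), converter);
    }

    boolean supports(String from, String to) {
        return converters.containsKey(key(from, to));
    }

    /** Converts a model given as text, failing when no converter is registered. */
    String convert(String from, String to, String model) {
        UnaryOperator<String> converter = converters.get(key(from, to));
        if (converter == null) {
            throw new IllegalArgumentException("No converter from " + from + " to " + to);
        }
        return converter.apply(model);
    }
}

class ConverterDemo {
    public static void main(String[] args) {
        ConverterFramework framework = new ConverterFramework();
        // Toy converter: wraps the input in a comment, no real translation.
        framework.register("SBML", "Dot", model -> "digraph { /* " + model + " */ }");

        System.out.println(framework.supports("SBML", "Dot"));   // prints "true"
        System.out.println(framework.supports("SBML", "GPML"));  // prints "false"
        System.out.println(framework.convert("SBML", "Dot", "model"));
    }
}
```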

    The Prospective Dutch Colorectal Cancer (PLCRC) cohort: real-world data facilitating research and clinical care

    Real-world data (RWD) sources are important to advance clinical oncology research and evaluate treatments in daily practice. Since 2013, the Prospective Dutch Colorectal Cancer (PLCRC) cohort, linked to the Netherlands Cancer Registry, has served as an infrastructure for scientific research, collecting additional patient-reported outcomes (PRO) and biospecimens. Here we report on cohort developments and investigate to what extent PLCRC reflects the “real-world”. Clinical and demographic characteristics of PLCRC participants were compared with the general Dutch CRC population (n = 74,692, Dutch-ref). To study representativeness, standardized differences between PLCRC and Dutch-ref were calculated, and logistic regression models were evaluated on their ability to distinguish cohort participants from the Dutch-ref (AU-ROC 0.5 = preferred, implying participation independent of patient characteristics). Stratified analyses by stage and time-period (2013–2016 and 2017–Aug 2019) were performed to study the evolution towards RWD. In August 2019, 5744 patients were enrolled. Enrollment increased steeply, from 129 participants (1 hospital) in 2013 to 2136 (50 of 75 Dutch hospitals) in 2018. Low AU-ROC (0.65, 95% CI: 0.64–0.65) indicates limited ability to distinguish cohort participants from the Dutch-ref. Characteristics that remained imbalanced in the period 2017–Aug’19 compared with the Dutch-ref were age (65.0 years in PLCRC, 69.3 in the Dutch-ref) and tumor stage (40% stage-III in PLCRC, 30% in the Dutch-ref). PLCRC is approaching representativeness of the Dutch CRC population and will ultimately meet the current demand for high-quality RWD. Efforts are ongoing to improve multidisciplinary recruitment, which will further enhance PLCRC’s representativeness and its contribution to a learning healthcare system.
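
The abstract does not state which formula was used for the standardized differences. As a worked illustration only, the sketch below assumes the commonly used definition for a binary characteristic, d = (p1 - p2) / sqrt((p1(1 - p1) + p2(1 - p2)) / 2), and applies it to the reported stage-III proportions (40% in PLCRC, 30% in the Dutch-ref). The class and method names are hypothetical, and the result is not a figure reported by the study.

```java
class StandardizedDifference {
    /**
     * Standardized difference between two proportions (each between 0 and 1),
     * using the average of the two binomial variances as the scale.
     * Assumed formula; the source abstract does not specify one.
     */
    static double forProportions(double p1, double p2) {
        double averageVariance = (p1 * (1 - p1) + p2 * (1 - p2)) / 2.0;
        return (p1 - p2) / Math.sqrt(averageVariance);
    }

    public static void main(String[] args) {
        // Stage-III share: 40% in PLCRC versus 30% in the Dutch-ref.
        double d = forProportions(0.40, 0.30);
        System.out.println("standardized difference = " + d);
    }
}
```

Under this assumed formula the stage-III difference comes out at roughly 0.21 (0.10 divided by the square root of 0.225).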